Abstract

We apply a recently developed adaptive algorithm that systematically improves the efficiency 
of parallel tempering or replica exchange methods in the numerical simulation of small proteins. 
Feedback iterations allow us to identify an optimal set of temperatures/replicas which are found to 
concentrate at the bottlenecks of the simulations. A measure of convergence for the equilibration 
of the parallel tempering algorithm is discussed. We test our algorithm by simulating the
36-residue villin headpiece sub-domain HP-36, where we find a lowest-energy configuration with a
root-mean-square deviation of less than 4 Å to the experimentally determined structure.
I. INTRODUCTION



Understanding the folding of proteins from computer simulations is a longstanding but
still elusive goal in computational biology. The difficulties stem from the fact that proteins
are only marginally stable. At room temperature the free energy difference between the
biologically active and unfolded states is only of order 10 kcal/mol. However, this small
gap results from the cancellation of large energetic and entropic terms, which poses two major
challenges to numerical simulations. On the one hand, one has to find a universal model that
captures this delicate balance. On the other hand, the competing interactions necessarily
lead to a rugged energy landscape that makes the exhaustive sampling of low-temperature
configurations a challenging computational task. In general, it has been hard to distinguish
which of the two difficulties is the limiting factor in computational protein studies.

In this paper we address the second challenge and apply a powerful sampling technique
that allows one to explore a complex energy landscape efficiently by systematically shifting
computational resources towards the bottlenecks of a simulation, which are typically in the
vicinity of free energy barriers. We test our algorithm by simulating the 36-residue villin
headpiece sub-domain HP-36. This molecule has raised considerable interest in computational
biology as it is one of the smallest proteins with well-defined secondary and tertiary
structure but, with 596 atoms, is still accessible to simulations. Its structure,
which was resolved by NMR analysis and deposited in the Protein Data Bank (PDB code
1vii), is shown in Fig. 1.

Recent computationally intensive investigations have studied this protein using molecular
dynamics and parallel tempering techniques. While the former study reports room-temperature
configurations that are within 4.0 Å of the native structure, these randomly
sampled configurations could not be singled out from the misfolded structures in a rigorous
way. The latter study tries to identify the biologically active state as those configurations
which minimize the energy functional of an implicit solvent model. However, despite
considerably long simulation times, low-energy configurations that resemble the experimentally
determined one were found with less than 20% frequency at T = 250 K. These conformers
still differed by root-mean-square deviations (RMSD) of about 4 to 6 Å from the native
structure and could also not be distinguished by their energies from that of the predominant
misfolded structure. What leads to the discrepancy with the experiments? The authors of
Ref. [8] argue that it is due to poor approximations in the simulated force field and especially
the implicit solvent model. Indeed, configurations with an RMSD of about 4 Å were later found
with high frequency in simulations with a modified energy function. However, the
alterations of the implicit solvent model are ad hoc and not universal, while the parameters
of the original model were fitted against experimental data. On the other hand, the data
of Ref. [8] could also indicate that, despite large computational efforts, the simulation has not
thermalized and the correct equilibrium distribution of low-energy configurations has not yet
been found.

Deciding between the two alternatives in the above example is especially important as
parallel tempering (also known as the replica exchange method) has recently become the
simulation technique of choice in protein studies. The question can be reformulated as:
how does one gauge the efficiency of a parallel tempering run and ensure that the sampling
is sufficiently long to reach thermal equilibration? The present paper describes a measure
for this purpose and discusses a protocol that allows one to optimize the performance of
parallel tempering runs by finding the best temperature distribution. Using the enhanced
parallel tempering protocol we demonstrate that the simulation of Ref. [8] had indeed not
thermalized. Instead, we now find a dominant lowest-energy configuration that is
within 4.0 Å of the native structure. This RMSD is comparable to the best ones found
in previous molecular dynamics simulations with a different search technique and energy
function, but our approach requires only 1% of their computational resources.

II. ALGORITHM DESIGN 

In parallel tempering simulations non-interacting copies, or "replicas", of the protein
are simultaneously simulated at a range of temperatures $\{T_1, T_2, \ldots, T_N\}$, e.g. by
distributing the simulation over $N$ nodes of a parallel computer. After a fixed number of Monte
Carlo sweeps (or a molecular dynamics run of a certain time) a sequence of swap moves,
the exchange of two replicas at neighboring temperatures, $T_i$ and $T_{i+1}$, is suggested and
accepted with a probability

$$p(E_i, T_i \to E_{i+1}, T_{i+1}) = \min\left(1, \exp(\Delta\beta \, \Delta E)\right) , \qquad (1)$$

where $\Delta\beta = 1/T_{i+1} - 1/T_i$ is the difference between the inverse temperatures and
$\Delta E = E_{i+1} - E_i$ is the difference in energy of the two replicas. For a given replica the swap
moves induce a random walk in temperature space that allows the replica to wander from
low temperatures, where barriers in a complex energy landscape lead to long relaxation
times, to high temperatures, where equilibration is rapid, and back. The convergence of a
parallel tempering run is governed by the relaxation at the lowest temperature and can be gauged by
the frequency of statistically independent visits at this temperature. A lower bound for this
number is the rate of round-trips $n_{\rm rt}$ between the lowest and highest temperature, $T_1$ and $T_N$
respectively. An equivalent measure is the round-trip time $t_{\rm rt}$, i.e. the average time it takes
a replica to move from the lowest temperature $T_1$ to the highest temperature $T_N$ and back. It
is this non-local measure in temperature space that one has to minimize in order to optimize
a parallel tempering simulation. Commonly, it is assumed that equilibration is fastest
if the local acceptance rate of swaps is the same for all pairs of neighboring temperatures
$T_i$ and $T_{i+1}$. Recently, it has been shown that this assumption is misleading.
Here we review the algorithm outlined in that work and apply it to systematically optimize the
temperature set used in our simulations in such a way that for each replica the number
of round-trips is maximized, and equilibration of the system at low temperatures is thereby
substantially improved.
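As an illustration of the swap move described above, the following minimal Python sketch evaluates the acceptance probability of Eq. (1) and performs one sequence of swap attempts over all neighboring temperature pairs. The function names, the `k_B` parameter and the list-based bookkeeping are assumptions of this sketch and are not taken from the SMMP implementation; with energies in kcal/mol and temperatures in K one would pass `k_B` of about 0.001987 kcal/(mol K).

```python
import math
import random


def swap_probability(E_i, T_i, E_j, T_j, k_B=1.0):
    """Acceptance probability for exchanging the replicas at T_i and T_j.

    Uses delta_beta = 1/(k_B T_j) - 1/(k_B T_i) and delta_E = E_j - E_i,
    so that p = min(1, exp(delta_beta * delta_E)).
    """
    delta_beta = (1.0 / T_j - 1.0 / T_i) / k_B
    delta_E = E_j - E_i
    exponent = delta_beta * delta_E
    if exponent >= 0.0:
        return 1.0          # always accept, and avoid overflow in exp
    return math.exp(exponent)


def attempt_swaps(energies, temperatures, rng=random, k_B=1.0):
    """One sequence of swap moves between neighbouring temperatures.

    energies[k] is the energy of the replica currently at temperatures[k].
    The list is updated in place; the indices k of accepted (k, k+1)
    exchanges are returned.
    """
    accepted = []
    for k in range(len(temperatures) - 1):
        p = swap_probability(energies[k], temperatures[k],
                             energies[k + 1], temperatures[k + 1], k_B)
        if rng.random() < p:
            energies[k], energies[k + 1] = energies[k + 1], energies[k]
            accepted.append(k)
    return accepted
```

In this form a swap is always accepted when the replica at the higher temperature has the lower energy, and is otherwise accepted with an exponentially small probability.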

We illustrate this approach by an example parallel tempering simulation of the 36-residue
protein HP-36 in an all-atom representation. The intramolecular interactions are described
by the ECEPP/2 force field

$$E_{\rm ECEPP/2} = E_{\rm C} + E_{\rm LJ} + E_{\rm tor} = \sum_{(i,j)} \frac{332\, q_i q_j}{\epsilon\, r_{ij}} + \sum_{(i,j)} \left( \frac{A_{ij}}{r_{ij}^{12}} - \frac{B_{ij}}{r_{ij}^{6}} \right) + \sum_{l} U_l \left( 1 \pm \cos(n_l \xi_l) \right) , \qquad (2)$$

where $r_{ij}$ is the distance between the atoms $i$ and $j$, $\xi_l$ is the $l$-th torsion angle, and energies
are measured in kcal/mol. The protein-solvent interactions are approximated by a solvent
accessible surface term

$$E_{\rm solv} = \sum_{i} \sigma_i A_i . \qquad (3)$$

Here $A_i$ is the solvent accessible surface area of the $i$-th atom in a given configuration,
and $\sigma_i$ a solvation parameter for the atom $i$. For the present investigation we use the
parameter set OONS of Ref. [22]. Our implementation is based on the software package
SMMP (Simple Molecular Mechanics for Proteins), which allowed us to distribute the
simultaneous simulation of $N = 20$ replicas on a Beowulf cluster with 2.2 GHz Opteron
processors. The initial temperature distribution for these replicas is listed in Table I. A
sequence of swap moves between neighboring temperatures is attempted after each Monte
Carlo sweep, where a sweep consists of a series of Metropolis tests for each of the dihedral
angles. Note that the implementation of the force field differs slightly from the one used in
earlier work, leading to (irrelevant) differences in the absolute energy values.

Our approach to optimize the simulated temperature set is inspired by a recently introduced
adaptive broad-histogram algorithm that maximizes the rate of round-trips in energy
space by shifting additional weight toward the bottlenecks of the simulation; it has previously
been outlined in the context of classical spin models. The bottlenecks of the simulation
are identified by measuring the local diffusivity of the simulated random walk. In the case
of a parallel tempering run, where we simulate a random walk in temperature space, this
quantity is calculated by adding a label "up" or "down" to each replica that indicates which
of the two extremal temperatures, $T_1$ or $T_N$ respectively, the replica has visited most
recently. The label of a replica changes only when the replica visits the opposite extremum.
For instance, the label of a replica with label "up" remains unchanged if the replica comes
back to the lowest temperature $T_1$, but changes to "down" upon its first visit to $T_N$. For
each temperature point in the temperature set $\{T_i\}$ we record two histograms $n_{\rm up}(T_i)$ and
$n_{\rm down}(T_i)$. Before attempting a sequence of swap moves we increment, at each temperature $T_i$,
the histogram that matches the label of the replica currently at temperature
$T_i$. If a replica has not yet visited either of the two extremal temperatures, we increment
neither of the histograms. For each temperature point this allows us to evaluate the average
fraction of replicas which diffuse from the lowest to the highest temperature as

$$f(T) = \frac{n_{\rm up}(T)}{n_{\rm up}(T) + n_{\rm down}(T)} . \qquad (4)$$

In Fig. 2 this fraction is plotted for our parallel tempering simulations of HP-36 with an
initial temperature distribution as listed in Table I.
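The labelling procedure can be sketched in a few lines of Python. This is only an illustration under assumed data structures: `replica_at[k]` holds the index of the replica currently at the k-th temperature, `labels[r]` holds the label of replica r, and both function names are hypothetical.

```python
def update_labels_and_histograms(replica_at, labels, n_up, n_down):
    """Record one measurement before a sequence of swap moves.

    replica_at[k] : index of the replica currently at temperature index k
    labels[r]     : "up", "down" or None (no extremum visited yet)
    n_up, n_down  : per-temperature histograms, updated in place
    """
    last = len(replica_at) - 1
    # A replica sitting at an extremal temperature takes that extremum's label.
    labels[replica_at[0]] = "up"        # most recently visited T_1
    labels[replica_at[last]] = "down"   # most recently visited T_N
    for k, r in enumerate(replica_at):
        if labels[r] == "up":
            n_up[k] += 1
        elif labels[r] == "down":
            n_down[k] += 1
        # unlabelled replicas are not counted in either histogram


def fraction_up(n_up, n_down):
    """f(T_k) = n_up / (n_up + n_down); None where nothing was recorded."""
    return [u / (u + d) if (u + d) > 0 else None
            for u, d in zip(n_up, n_down)]
```

By construction the resulting fraction is 1 at the lowest temperature and 0 at the highest one, and it decreases in between once enough measurements have been accumulated.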

The so-labelled replicas define a steady-state current from $T_1$ to $T_N$ that is proportional
to the round-trip rate $n_{\rm rt}$ and therefore independent of temperature. To first order
in the derivative this current is given by

$$j = D(T)\, \eta(T)\, \frac{df}{dT} , \qquad (5)$$

where $D(T)$ is the local diffusivity at temperature $T$ and $\eta(T)$ is the probability distribution
for a replica to reside at temperature $T$, where the temperature $T$ is now assumed to be
a continuous variable (and not limited to the points of the current temperature set). For
a given temperature set we approximate this probability distribution with a step function
$\eta(T) = C/\Delta T$, where $\Delta T = T_{i+1} - T_i$ is the length of the temperature interval
$T_i < T < T_{i+1}$ in the current temperature set. The normalization constant $C$
is chosen such that

$$\int_{T_1}^{T_N} \eta(T)\, dT = 1 . \qquad (6)$$

Rearranging Eq. (5) gives a simple measure of the local diffusivity $D(T)$:

$$D(T) \sim \frac{\Delta T}{df/dT} , \qquad (7)$$

where we have dropped the normalization $C$ and the constant current $j$.
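A simple way to evaluate this measure from the recorded fraction f is sketched below. The sketch assumes a plain finite-difference estimate of the derivative on each temperature interval (the source does not specify how the derivative is obtained), so that the diffusivity is proportional to the squared interval length divided by the change of f; the function name is hypothetical.

```python
def local_diffusivity(temperatures, f):
    """Estimate D(T) on each interval [T_i, T_{i+1}] up to a constant factor.

    With df/dT approximated by (f_{i+1} - f_i) / dT, the relation
    D ~ dT / |df/dT| becomes D ~ dT**2 / |f_{i+1} - f_i|.
    Intervals where f does not change are assigned infinite diffusivity.
    """
    D = []
    for i in range(len(temperatures) - 1):
        dT = temperatures[i + 1] - temperatures[i]
        df = abs(f[i + 1] - f[i])
        D.append(dT * dT / df if df > 0 else float("inf"))
    return D
```

The absolute value is taken because only the magnitude of the slope matters here; f decreases from the lowest to the highest temperature.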

For the parallel tempering simulation of HP-36 this quantity is plotted in Fig. 3. The
diffusivity shows a strong modulation along the simulated temperature range 250 to 1000 K
(note the logarithmic scale of the ordinate). A pronounced minimum occurs around $T \approx 490$ K,
where the diffusivity is suppressed by two orders of magnitude in comparison to
the temperature ranges below 350 K and above 600 K. This minimum in the diffusivity
points to a severe bottleneck for the random walk in temperature space: replicas can move
back and forth in temperature rapidly below and above this bottleneck, but experience
a dramatic slowdown as they approach and pass through the temperature range around
490 K. This behavior can be explained by a free energy barrier associated with a
structural transition of the protein; the minimum in the local diffusivity is located slightly
below the maximum of the specific heat at $T \approx 500$ K, which is also plotted in Fig. 3.
For HP-36 in the ECEPP/2 force field it has been shown that the position of this peak
separates a high-temperature phase with extended unordered configurations from a low-temperature
region that is characterized by high helical content of the molecule. Below
this transition a shoulder in the measured local diffusivity points to a second bottleneck
in the simulation for an extended range of temperatures 350 K < T < 490 K, possibly
caused by competing low-energy configurations with high helical content. While the specific
heat for this temperature range is slightly larger than in the high-temperature region above
600 K, it shows no characteristic feature similar to the progression of the local diffusivity.
The local diffusivity is thus a more sensitive measure to identify bottlenecks in a parallel
tempering simulation and to locate the multiple temperature scales dominating the folding
process of a protein for a given force field.

In order to speed up equilibration we want to maximize the rate of round-trips which
each replica performs between the two extremal temperatures, or equivalently the diffusive
current $j$, by varying the temperature set $\{T_i\}$ and thus the probability distribution $\eta(T)$, as
discussed in earlier work. Rearranging and integrating Eq. (5), this goal is achieved by minimizing
the integral

$$\int_{T_1}^{T_N} \left( \frac{1}{D(T)\, \eta(T)} + \lambda\, \eta(T) \right) dT , \qquad (8)$$

where we have added a Lagrange multiplier $\lambda$ which ensures that $\eta(T)$ remains a normalized
probability distribution. Varying the probability distribution $\eta(T)$, the integrand in Eq. (8)
is minimized for

$$\eta^{\rm (opt)}(T) = \frac{C'}{\sqrt{D(T)}} , \qquad (9)$$

where the normalization $C'$ is again chosen according to the normalization condition in
Eq. (6). For the optimal temperature set the temperature points are thus rearranged in such
a way that the probability distribution $\eta^{\rm (opt)}(T)$ becomes inversely proportional to the square
root of the local diffusivity. Measuring the local diffusivity $D(T)$ for an initial temperature
set, we can determine the optimized probability distribution $\eta^{\rm (opt)}(T)$, approximated as a
step function in the original temperature set. The optimized temperature set $\{T'_i\}$ is then
found by choosing the $n$-th temperature point $T'_n$ such that

$$\int_{T'_1}^{T'_n} \eta^{\rm (opt)}(T)\, dT = \frac{n-1}{N-1} , \qquad (10)$$

where $1 < n < N$ and the two extremal temperatures $T'_1 = T_1$ and $T'_N = T_N$ remain fixed.
This feedback of the local diffusivity is then iterated for increasingly long simulation runs
(in our simulations we double the number of swaps for subsequent feedback steps) until
convergence of the optimized temperature set is found.
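One feedback step can be sketched as follows. The sketch assumes that the diffusivity is available as one positive value per interval of the current temperature set (so that the optimized distribution is a step function), and that the interior points are placed at equal increments of the integrated distribution with both end points fixed; the function name and this indexing convention are assumptions of the sketch.

```python
import math


def optimized_temperatures(temperatures, D):
    """Return a new temperature set with density proportional to 1/sqrt(D).

    D[i] is the (positive) diffusivity estimate on the interval
    [temperatures[i], temperatures[i+1]]; T_1 and T_N are kept fixed.
    """
    N = len(temperatures)
    eta = [1.0 / math.sqrt(d) for d in D]            # unnormalised step function
    widths = [temperatures[i + 1] - temperatures[i] for i in range(N - 1)]
    cum = [0.0]                                      # integral of eta up to each T_i
    for e, w in zip(eta, widths):
        cum.append(cum[-1] + e * w)
    total = cum[-1]

    new_T = [temperatures[0]]
    i = 0
    for n in range(1, N - 1):                        # interior points only
        target = total * n / (N - 1)
        while cum[i + 1] < target:                   # find interval containing target
            i += 1
        # eta is constant inside interval i, so interpolate linearly
        new_T.append(temperatures[i] + (target - cum[i]) / eta[i])
    new_T.append(temperatures[-1])
    return new_T
```

For a temperature-independent diffusivity and equidistant input temperatures the set is returned unchanged, whereas an interval with suppressed diffusivity attracts additional temperature points, which is the behavior described in the text.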

In our simulations we start with the arbitrary initial temperature set of Table I that, similar
to a geometric progression, concentrates temperature points at low temperatures. Three
feedback steps were performed, one after 100,000 MC sweeps, a second after a further 200,000
sweeps, and a third one after an additional 400,000 sweeps. The iterated temperature sets are
plotted in Fig. 4 and also listed in Table I. The feedback algorithm shifts computational
resources towards the temperature of the helix-coil transition, and temperature points in
the optimized temperature sets concentrate around $T \approx 500$ K where the measured local
diffusivity is suppressed, see Fig. 3. In the derivation of the feedback procedure we have
assumed that the local diffusivity is to leading order independent of the temperature set.
A posteriori we can verify this assumption by demonstrating that the optimized temperature
set is independent of the initial temperature set. To this end, we perform a second series of
feedback optimization steps starting from the temperature set used in the earlier parallel
tempering study. As illustrated in the lower half of Fig. 4, we indeed find that a very similar
distribution is approached.

For the optimized temperature set the acceptance probabilities of replica swaps show
a strong temperature dependence, as illustrated in Fig. 5. This is a consequence of the
concentration of temperature points around $T \approx 500$ K in the optimized temperature set
for HP-36. There the acceptance probabilities are found to be relatively high (around
80%), while in the temperature regions below 350 K and above 600 K, where temperature
points have been thinned out, the acceptance probabilities drop below about 40%. The fact
that for our optimized temperature set the acceptance probabilities vary with temperature
contradicts various alternative approaches in the literature that aim at maximizing
equilibration by choosing a temperature set where the acceptance probability of attempted
swaps is independent of temperature.



III. SIMULATION RESULTS



The feedback iterations systematically optimize the temperature set and thereby maximize the
efficiency of parallel tempering simulations. We now turn to the results obtained for our
simulations of HP-36 and discuss the effects of the temperature reweighting. Though
parallel tempering simulations allow one to evaluate thermodynamic quantities over a range of
temperatures, here we focus on the properties of the configurations at the lowest temperatures.
In Fig. 6 the radius of gyration $R_{\rm gy}$, which measures the compactness of a protein
configuration, is plotted for the lowest-energy configuration versus the number of Monte
Carlo sweeps. For the initial iteration the radius of gyration varies in a broad range of
10 to 14 Å. A histogram of $R_{\rm gy}$ is plotted on top of the time series in Fig. 6, showing that
two sets of configurations are found: "compact" configurations characterized
by a radius of gyration in the range 10 to 11 Å and "extended" configurations with a radius
of gyration in the range 12 to 14 Å. Averaging over some 100,000 MC sweeps in the first
iteration we find that about 25% of the configurations are "compact" and the remaining 75%
are "extended". Previous simulations with a total of 150,000 MC sweeps also
reported the occurrence of these two sets of configurations. Similar to our case the "extended"
configurations dominated, and only a small fraction of 20% of the configurations
were "compact". In the present study we continued the simulation after the first feedback
step with an optimized temperature set for an additional 200,000 MC sweeps. The time
series in Fig. 6 shows that, as a consequence, the fraction of "extended" configurations among the
lowest-energy configurations is significantly reduced and some 90% of the sampled lowest-energy
configurations have a radius of gyration smaller than 11 Å. This ratio increases further
to 99% for the final iteration with 400,000 MC sweeps after the second feedback step. While
in the previous study equilibration at low temperatures was determined by analyzing the
time series for thermodynamic observables such as the potential energy, and convergence
was found after some 100,000 MC sweeps, the discrepancy with the results presented here
casts serious doubt on whether an overall simulation time of some 150,000 MC sweeps and a
sub-optimal temperature set were sufficient to reach full equilibration. The long relaxation
times in our example indicate that even with a sophisticated technique like parallel tempering
the simulation times have to be considerably longer than commonly assumed. In order
to ensure equilibration at the lowest temperature the number of round trips should be
at least $n_{\rm rt} \approx 10$.
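The round-trip criterion can be monitored during a run with a small counter such as the one sketched here. It assumes that for each replica the sequence of visited temperature indices is recorded once per swap sequence; the function name and this bookkeeping are assumptions of the sketch.

```python
def count_round_trips(index_trajectory, n_temperatures):
    """Count completed round trips T_1 -> T_N -> T_1 for a single replica.

    index_trajectory lists the temperature index (0 .. N-1) at which the
    replica resides, one entry per sequence of swap moves.
    """
    top = n_temperatures - 1
    trips = 0
    state = None                 # None: T_1 not yet visited
    for k in index_trajectory:
        if k == 0:
            if state == "down":  # returned to T_1 after reaching T_N
                trips += 1
            state = "up"
        elif k == top and state == "up":
            state = "down"
    return trips
```

A run would then be considered sufficiently long, in the sense of the criterion above, only once the counted number of round trips reaches about ten.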

To probe whether our simulations allow a structural prediction of the true ground state
configuration we track the configuration with the overall lowest energy in the simulation and
compare it to the Protein Data Bank structure of HP-36 (PDB code 1vii). The lowest-energy
configuration obtained in our simulation is illustrated in Fig. 7. Despite the fact that in
this structure the two N-terminal helices have merged into one long helix (comprising residues
5 to 21) that packs tightly against the C-terminal helix, its RMSD to the PDB structure is only
$r_{\rm RMSD} = 3.7$ Å. This value is substantially lower than in the structures with an RMSD
of $r_{\rm RMSD} \approx 5.8$ to $6.0$ Å previously obtained by molecular dynamics simulations, Monte
Carlo simulations and optimization techniques. A structure with a comparable RMSD of
$r_{\rm RMSD} \approx 3.5$ Å has been obtained by large-scale molecular dynamics simulations. However,
in those simulations the best-matching structure was found by comparing all sampled
configurations along multiple trajectories to the PDB structure, while in our simulations the
optimal structure is singled out as the one with the lowest energy. In addition, our
simulations consumed only about 1% of the computing resources (about 1,000 CPU years)
used in Ref. [7].

IV. CONCLUSIONS 

In conclusion, we have applied a powerful feedback algorithm for the numerical simulation
of proteins that allocates computational resources in a parallel tempering simulation
so that equilibration at all temperatures is considerably improved. By tracking the diffusion
of replicas in temperature space we have identified the bottlenecks of a simulation, typically
in the vicinity of the folding transition. Feeding back this information we obtain an optimal
temperature set that concentrates temperature points at these bottlenecks. Our algorithm
differs from previous approaches that aim at maximizing equilibration by considering the
local acceptance probabilities of replica exchange moves. In contrast, we find that for the
optimal temperature set acceptance probabilities for such swap moves show a strong
temperature dependence. Applying the optimized parallel tempering technique to the simulation
of the 36-residue protein HP-36 we find a dominant low-energy configuration with less than
4 Å root-mean-square deviation from the native structure within a fraction of the computing
time consumed by high-performance molecular dynamics simulations.

We note, however, that the energy difference between our compact, lowest-energy
configuration and the extended structure with lowest energy, which differs from the PDB
structure by an RMSD of 8.0 Å, is only about 10 kcal/mol (for the minimized configurations).
On the other hand, the energy of our lowest-energy configuration is 100 kcal/mol lower than
that of the (minimized) PDB structure, from which (despite the small RMSD) it still differs
considerably. Hence, while our results appear to be closer to the experimental results than
previous simulations, they still demonstrate the limitations on protein simulations that are
inherent in present energy functions. The extremely long relaxation times indicate the
existence of spurious minima that should be absent in the folding funnel of fast-folding proteins
such as the villin headpiece. Unveiling these limitations in the energy functions and their
underlying causes requires optimized simulation techniques such as the one applied in the
present paper.
TABLE I: Temperature sets used in the parallel tempering simulation of HP-36. Applying the
feedback algorithm, temperature points in the optimized temperature sets (iterations 2, 3 and 4)
concentrate near the helix-coil transition around 500 K. The temperature sets are also illustrated
in Fig. 4.

iteration   temperature set [K]
1           250 275 300 325 350 375 400 425 450 500 550 600 650 700 750 800 850 900 950 1000
2           250 295 326 349 371 395 424 446 464 482 499 514 528 543 559 577 595 628 693 1000
3           250 326 359 385 411 434 452 467 480 491 501 510 519 527 536 546 560 578 619 1000
4           250 314 358 381 402 423 444 461 474 484 494 502 511 519 529 540 554 576 670 1000
FIG. 1: (color online) NMR-derived structure of the 36-residue peptide HP-36 as deposited in the
Protein Data Bank (1vii).

FIG. 2: Fraction of replicas diffusing from the lowest temperature, $T_1 = 250$ K, to the highest
temperature, $T_N = 1000$ K, in a parallel tempering simulation of HP-36. For the optimized
temperature set (iteration 3), the temperature points are distributed in such a way that the
fraction shows a nearly constant decay $\Delta f_i = f(T_i) - f(T_{i+1}) = 1/(N - 1)$ between adjacent
temperature points, i.e. $\Delta f_i (N - 1) \approx {\rm const}$.

FIG. 3: Local diffusivity $D(T)$ (left ordinate) for a random walk in temperature space performed
by a replica in a parallel tempering simulation of the chicken villin headpiece subdomain HP-36
using the ECEPP/2 force field and an implicit solvent. The diffusivity shows a strong modulation
with temperature; note the logarithmic scale of the left ordinate. A pronounced minimum in the
local diffusivity occurs slightly below the helix-coil transition around $T \approx 500$ K, where the
specific heat $C_V(T)$ (right ordinate) has a maximum (dashed line).

FIG. 4: (color online) Optimized temperature sets for a parallel tempering simulation of HP-36
obtained by the feedback algorithm for two different initial temperature sets. Independent of the
initial temperature set, the optimized temperature sets converge to a temperature set that
concentrates temperatures in the vicinity of the helix-coil transition temperature around
$T \approx 500$ K (dashed line).

FIG. 5: Acceptance probabilities (open squares) of replica swaps in a parallel tempering simulation
of HP-36 using the optimized temperature set illustrated in Fig. 4. The dependence of the
acceptance probabilities on the temperature closely reflects the shape of the measured local
diffusivity (filled circles). In the vicinity of the helix-coil transition temperature $T \approx 500$ K,
where the local diffusivity is strongly suppressed, the acceptance probabilities are highest due to
the contraction of temperature points in the optimized temperature set. The dotted line indicates
the minimum in the local diffusivity.

FIG. 6: (color online) Radius of gyration of the lowest-energy configuration in a parallel tempering
simulation of HP-36 versus the number of Monte Carlo sweeps. The dashed lines indicate when
the temperature set used in the simulation was redefined, as illustrated in the upper half of Fig. 4.
The insets show histograms of the radius of gyration for the three simulation parts.

FIG. 7: (color online) Lowest-energy structure of HP-36 as obtained by an all-atom Monte Carlo
simulation using the ECEPP/2 force field and an implicit solvent. The root-mean-square deviation
of this structure to the PDB structure shown in Fig. 1 is $r_{\rm RMSD} = 3.7$ Å.
[Fig. 4 panel labels: "arbitrary initial temperature set" (upper half); "temperature set used in Ref. [6]" (lower half).]